11. Evaluating Performance as a Clinician

Summary & Exercise Instructions

Below, Mazen presents a clinician’s perspective on assessing the performance of assistive systems in general, not only segmentation models. As a data scientist, you need to understand this perspective so that you can speak with clinicians in common terms.

Summary

Note that I am talking about performance in a different sense here. As a clinician, I need to decide whether a condition is present and which course of treatment to select. To do that, clinicians reason in terms of likelihood ratios.

The likelihood ratio for a diagnostic test result can be calculated if the predictive characteristics (sensitivity and specificity) of that test are known. Likelihood ratios are known for common diagnostic tests performed by humans (e.g., correctly identifying viral pneumonia from chest CT scans). This means that, for example, your ML segmentation algorithm may measure the volume of a specific anomaly in the lung very accurately, but this measurement, while important for quantifying the degree of lung involvement by some disease state, may not be specific at all for predicting whether that state is due to viral pneumonia (e.g., such anomalies could indicate viral pneumonia, bacterial pneumonia, or non-infectious causes such as hemorrhage or edema). Thus, an algorithm with high Dice scores may end up not being very useful for a clinical task if the goal is a specific diagnosis.
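
To make the relationship between sensitivity, specificity, and likelihood ratios concrete, here is a minimal Python sketch. It uses the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity, and shows how a likelihood ratio turns a pre-test probability into a post-test probability by way of odds. The function names and the numbers in the example (sensitivity 0.90, specificity 0.80, pre-test probability 0.30) are hypothetical and chosen only for illustration; they do not describe any real test.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary diagnostic test.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    # specificity of exactly 0 or 1 would cause a division by zero
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must be strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    if not 0.0 <= pre_test_probability < 1.0:
        raise ValueError("pre-test probability must be in [0, 1)")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


if __name__ == "__main__":
    # Hypothetical test characteristics, for illustration only.
    lr_pos, lr_neg = likelihood_ratios(sensitivity=0.90, specificity=0.80)
    print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")

    # Hypothetical pre-test probability of 30%.
    p_after_positive = post_test_probability(0.30, lr_pos)
    p_after_negative = post_test_probability(0.30, lr_neg)
    print(f"After a positive result: {p_after_positive:.2f}")
    print(f"After a negative result: {p_after_negative:.2f}")
```

A finding that is common to several conditions has low specificity for any one of them, which pushes LR+ toward 1 and leaves the post-test probability close to the pre-test probability. That is the sense in which an accurate volume measurement may still say little about a specific diagnosis.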

Quiz - Evaluating Clinical Performance

In which of the following clinical scenarios would accurately measuring the volume of a disease state (and therefore having a high Dice score) be most relevant?

SOLUTION: Quantifying tumor burden over time in a patient with known cancer.